Careers


Big Data/Hadoop Developer – Tampa, FL

Job Description

  • Develop Big Data applications following the Agile software development life cycle for fast, efficient progress.
  • Study various upcoming tools and work on proofs of concept for leveraging them in the current system.
  • Write software to ingest data into Hadoop and build scalable, maintainable Extract, Transform, and Load (ETL) jobs.
  • Create Scala/Spark jobs for data transformation and aggregation with a focus on the functional programming paradigm.
  • Build distributed, reliable, and scalable data pipelines to ingest and process data in real time, handling impression streams, transaction behaviors, clickstream data, and other unstructured data.
  • Load data into Spark RDDs and perform in-memory computation to generate output responses.
  • Develop shell scripts and automate data management for end-to-end integration work.
  • Develop Oozie workflows for scheduling and orchestrating the ETL process, using the Oozie workflow engine for job scheduling.
  • Create an environment to access loaded data via Spark SQL through JDBC and ODBC (via the Spark Thrift Server).
  • Develop real-time data ingestion and analysis using Kafka and Spark Streaming.
  • Design and implement a backup and disaster recovery strategy based on the Cloudera BDR utility for batch applications and Kafka MirrorMaker for real-time streaming applications.
  • Implement Hadoop security measures such as Kerberos, Cloudera Key Trustee Server, and Key Trustee management systems.
  • Use Hive join queries to join multiple tables from a source system and load the results into Elasticsearch.
  • Work on Cassandra and Query Grid. Implement shell scripts to move data from relational databases to HDFS (Hadoop Distributed File System) and vice versa.
  • Optimize and performance-tune the cluster by changing parameters based on benchmarking results.
  • Run complex queries and work on bucketing, partitioning, joins, and sub-queries.
  • Apply different HDFS file formats and structures, such as Parquet and Avro, to speed up analytics.
  • Translate, load, and present disparate data sets from various sources and formats, such as JSON, text files, Kafka queues, and log data.
  • Write complex workflow jobs using Oozie and set up a multi-program scheduler system to manage multiple Hadoop, Hive, Sqoop, and Spark jobs.

Required Skills:

  • A minimum of a bachelor's degree in computer science or equivalent.
  • Cloudera Hadoop (CDH), Cloudera Manager, Informatica Big Data Edition (BDM), HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata Studio Express, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, SBT, Maven, Jenkins, Oracle, MS SQL Server, Shell Scripting, Eclipse IDE, Git, SVN
  • Must have strong problem-solving and analytical skills.
  • Must have the ability to identify complex problems and review related information to develop and evaluate options and implement solutions.

If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be part of this fascinating industry, send your resume to HSTechnologies LLC, 2801 W Parker Road, Suite #5, Plano, TX 75023, or email it to hr@sbhstech.com.